Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 1121825 |
| Missing cells | 195100 |
| Missing cells (%) | 1.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 136.9 MiB |
| Average record size in memory | 128.0 B |
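The percentage and per-record figures above follow from the raw counts. The sketch below re-derives them as a sanity check. The variable names are illustrative, and reading "MiB" as 2^20 bytes is an assumption rather than something the report states.

```python
# Re-derive the summary figures from the raw counts reported above.
n_variables = 16
n_observations = 1_121_825
missing_cells = 195_100
total_size_mib = 136.9  # assumed to mean mebibytes (2**20 bytes)

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, from the total in-memory size.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

A record size of about 128 B would be consistent with 16 columns stored at 8 bytes each, although the report does not say how the columns are stored.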
Variable types
| Numeric | 12 |
|---|---|
| Categorical | 4 |
Alerts
| Alert | Type |
|---|---|
| time has a high cardinality: 45093 distinct values | High cardinality |
| gameId is highly correlated with team | High correlation |
| frameId is highly correlated with s and 1 other fields | High correlation |
| s is highly correlated with dis | High correlation |
| a is highly correlated with s | High correlation |
| dis is highly correlated with s | High correlation |
| team is highly correlated with gameId | High correlation |
| nflId has 48775 (4.3%) missing values | Missing |
| jerseyNumber has 48775 (4.3%) missing values | Missing |
| o has 48775 (4.3%) missing values | Missing |
| dir has 48775 (4.3%) missing values | Missing |
| s has 69592 (6.2%) zeros | Zeros |
| a has 65085 (5.8%) zeros | Zeros |
| dis has 70615 (6.3%) zeros | Zeros |
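The percentages attached to the missing-value and zero counts above are each count divided by the number of observations. A minimal sketch of that arithmetic follows. The dictionary layout and key names are illustrative only.

```python
# Counts copied from the lines above; the key names are illustrative.
n_observations = 1_121_825
alert_counts = {
    "nflId missing": 48_775,
    "s zeros": 69_592,
    "a zeros": 65_085,
    "dis zeros": 70_615,
}

# Express each count as a percentage of all observations, to one decimal.
alert_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in alert_counts.items()
}

for name, pct in alert_pct.items():
    print(f"{name}: {pct}%")
```

The same missing count (48775) is reported for nflId, jerseyNumber, o and dir, so one computation covers all four.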
Reproduction
| Analysis started | 2022-11-02 15:05:19.120450 |
|---|---|
| Analysis finished | 2022-11-02 15:06:59.168554 |
| Duration | 1 minute and 40.05 seconds |
| Software version | pandas-profiling v3.4.0 |
| Download configuration | config.json |
gameId
Real number (ℝ≥0)
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2021092594 |
| Minimum | 2021092300 |
| Maximum | 2021092700 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.6 MiB |
Quantile statistics
| Minimum | 2021092300 |
|---|---|
| 5-th percentile | 2021092300 |
| Q1 | 2021092602 |
| median | 2021092606 |
| Q3 | 2021092610 |
| 95-th percentile | 2021092700 |
| Maximum | 2021092700 |
| Range | 400 |
| Interquartile range (IQR) | 8 |
Descriptive statistics
| Standard deviation | 76.16287875 |
|---|---|
| Coefficient of variation (CV) | 3.768401258 × 10⁻⁸ |
| Kurtosis | 10.07169537 |
| Mean | 2021092594 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | -3.159388939 |
| Sum | 2.2673122 × 10¹⁵ |
| Variance | 5800.7841 |
| Monotonicity | Increasing |
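Several of the figures in the tables above are linked by definition: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the range and IQR are differences of quantiles. The sketch below checks those relationships using the reported values. The variable names are illustrative, and because the values look like identifiers the check illustrates the arithmetic rather than anything meaningful about the data.

```python
# Values copied from the tables above.
mean = 2021092594
std_dev = 76.16287875
q1, q3 = 2021092602, 2021092610
minimum, maximum = 2021092300, 2021092700

cv = std_dev / mean              # coefficient of variation
variance = std_dev ** 2          # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV = {cv:.9e}")
print(f"Variance = {variance:.4f}")
print(f"IQR = {iqr}, Range = {value_range}")
```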
| Value | Count | Frequency (%) |
| 2021092610 | 93334 | 8.3% |
| 2021092604 | 83352 | 7.4% |
| 2021092600 | 79212 | 7.1% |
| 2021092605 | 77050 | 6.9% |
| 2021092606 | 74957 | 6.7% |
| 2021092602 | 74405 | 6.6% |
| 2021092611 | 73899 | 6.6% |
| 2021092700 | 67390 | 6.0% |
| 2021092607 | 66654 | 5.9% |
| 2021092601 | 65090 | 5.8% |
| Other values (6) | 366482 | 32.7% |
| Value | Count | Frequency (%) |
| 2021092300 | 64584 | |
| 2021092600 | 79212 | |
| 2021092601 | 65090 | |
| 2021092602 | 74405 | |
| 2021092603 | 60513 | |
| 2021092604 | 83352 | |
| 2021092605 | 77050 | |
| 2021092606 | 74957 | |
| 2021092607 | 66654 | |
| 2021092608 | 55867 |
| Value | Count | Frequency (%) |
| 2021092700 | 67390 | |
| 2021092613 | 64446 | |
| 2021092612 | 59754 | |
| 2021092611 | 73899 | |
| 2021092610 | 93334 | |
| 2021092609 | 61318 | |
| 2021092608 | 55867 | |
| 2021092607 | 66654 | |
| 2021092606 | 74957 | |
| 2021092605 | 77050 |
playId
Real number (ℝ≥0)
| Distinct | 1000 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2155.017796 |
| Minimum | 54 |
| Maximum | 4928 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.6 MiB |
Quantile statistics
| Minimum | 54 |
|---|---|
| 5-th percentile | 247 |
| Q1 | 1111 |
| median | 2173 |
| Q3 | 3174 |
| 95-th percentile | 4038 |
| Maximum | 4928 |
| Range | 4874 |
| Interquartile range (IQR) | 2063 |
Descriptive statistics
| Standard deviation | 1215.226313 |
|---|---|
| Coefficient of variation (CV) | 0.5639054655 |
| Kurtosis | -1.119652312 |
| Mean | 2155.017796 |
| Median Absolute Deviation (MAD) | 1029 |
| Skewness | 0.01527711883 |
| Sum | 2417552839 |
| Variance | 1476774.993 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2734 | 4669 | 0.4% |
| 100 | 4186 | 0.4% |
| 1547 | 4140 | 0.4% |
| 1174 | 3588 | 0.3% |
| 3544 | 3565 | 0.3% |
| 55 | 3335 | 0.3% |
| 1799 | 3266 | 0.3% |
| 232 | 3197 | 0.3% |
| 1536 | 3174 | 0.3% |
| 3610 | 3105 | 0.3% |
| Other values (990) | 1085600 |
| Value | Count | Frequency (%) |
| 54 | 2507 | |
| 55 | 3335 | |
| 75 | 1564 | |
| 76 | 1817 | |
| 77 | 690 | 0.1% |
| 78 | 828 | 0.1% |
| 79 | 1012 | 0.1% |
| 80 | 713 | 0.1% |
| 97 | 897 | 0.1% |
| 98 | 1541 |
| Value | Count | Frequency (%) |
| 4928 | 943 | |
| 4883 | 989 | |
| 4793 | 736 | |
| 4767 | 1012 | |
| 4694 | 759 | |
| 4670 | 1288 | |
| 4631 | 1058 | |
| 4583 | 1012 | |
| 4548 | 1380 | |
| 4526 | 966 |
nflId
Real number (ℝ≥0)
| Distinct | 1162 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 48775 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 45749.33891 |
| Minimum | 25511 |
| Maximum | 54038 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.6 MiB |
Quantile statistics
| Minimum | 25511 |
|---|---|
| 5-th percentile | 37266 |
| Q1 | 42471 |
| median | 45281 |
| Q3 | 48027 |
| 95-th percentile | 53480 |
| Maximum | 54038 |
| Range | 28527 |
| Interquartile range (IQR) | 5556 |
Descriptive statistics
| Standard deviation | 4993.095966 |
|---|---|
| Coefficient of variation (CV) | 0.1091402867 |
| Kurtosis | 0.1305421809 |
| Mean | 45749.33891 |
| Median Absolute Deviation (MAD) | 2807 |
| Skewness | -0.2086622686 |
| Sum | 4.909132811 × 10¹⁰ |
| Variance | 24931007.33 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 47810 | 2309 | 0.2% |
| 47861 | 2309 | 0.2% |
| 42924 | 2309 | 0.2% |
| 52426 | 2309 | 0.2% |
| 52447 | 2309 | 0.2% |
| 53471 | 2309 | 0.2% |
| 43380 | 2309 | 0.2% |
| 53472 | 2220 | 0.2% |
| 44823 | 2152 | 0.2% |
| 53505 | 2152 | 0.2% |
| Other values (1152) | 1050363 | |
| (Missing) | 48775 | 4.3% |
| Value | Count | Frequency (%) |
| 25511 | 2010 | |
| 28963 | 2112 | |
| 29550 | 1283 | |
| 29851 | 1043 | |
| 30842 | 169 | < 0.1% |
| 30869 | 1203 | |
| 33084 | 1620 | |
| 33107 | 1400 | |
| 33130 | 473 | < 0.1% |
| 33131 | 987 |
| Value | Count | Frequency (%) |
| 54038 | 46 | < 0.1% |
| 54006 | 628 | |
| 53957 | 1335 | |
| 53946 | 51 | < 0.1% |
| 53935 | 194 | < 0.1% |
| 53930 | 253 | < 0.1% |
| 53876 | 260 | < 0.1% |
| 53687 | 59 | < 0.1% |
| 53681 | 77 | < 0.1% |
| 53679 | 430 | < 0.1% |
frameId
Real number (ℝ≥0)
| Distinct | 203 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24.06156843 |
| Minimum | 1 |
| Maximum | 203 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 11 |
| median | 22 |
| Q3 | 33 |
| 95-th percentile | 53 |
| Maximum | 203 |
| Range | 202 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 17.5924006 |
|---|---|
| Coefficient of variation (CV) | 0.7311410583 |
| Kurtosis | 11.61714996 |
| Mean | 24.06156843 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 2.109047558 |
| Sum | 26992869 |
| Variance | 309.492559 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 26243 | 2.3% |
| 12 | 26243 | 2.3% |
| 21 | 26243 | 2.3% |
| 20 | 26243 | 2.3% |
| 19 | 26243 | 2.3% |
| 18 | 26243 | 2.3% |
| 17 | 26243 | 2.3% |
| 16 | 26243 | 2.3% |
| 15 | 26243 | 2.3% |
| 14 | 26243 | 2.3% |
| Other values (193) | 859395 |
| Value | Count | Frequency (%) |
| 1 | 26243 | |
| 2 | 26243 | |
| 3 | 26243 | |
| 4 | 26243 | |
| 5 | 26243 | |
| 6 | 26243 | |
| 7 | 26243 | |
| 8 | 26243 | |
| 9 | 26243 | |
| 10 | 26243 |
| Value | Count | Frequency (%) |
| 203 | 23 | |
| 202 | 23 | |
| 201 | 23 | |
| 200 | 23 | |
| 199 | 23 | |
| 198 | 23 | |
| 197 | 23 | |
| 196 | 23 | |
| 195 | 23 | |
| 194 | 23 |
time
Categorical
| Distinct | 45093 |
|---|---|
| Distinct (%) | 4.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.6 MiB |
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 23 |
| Min length | 23 |
Characters and Unicode
| Total characters | 25801975 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 2021-09-24T00:23:08.400 |
|---|---|
| 2nd row | 2021-09-24T00:23:08.500 |
| 3rd row | 2021-09-24T00:23:08.600 |
| 4th row | 2021-09-24T00:23:08.700 |
| 5th row | 2021-09-24T00:23:08.800 |
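The values above are stored as fixed-length 23-character strings, which is why the column is profiled as text rather than as a date. A minimal sketch of parsing them with the standard library follows. It uses only the five sample rows shown above, so the 100 ms spacing it finds is an observation about those rows and not a claim about the whole dataset.

```python
from datetime import datetime, timedelta

# The five sample rows listed above.
samples = [
    "2021-09-24T00:23:08.400",
    "2021-09-24T00:23:08.500",
    "2021-09-24T00:23:08.600",
    "2021-09-24T00:23:08.700",
    "2021-09-24T00:23:08.800",
]

# ISO 8601 strings with millisecond precision parse directly.
timestamps = [datetime.fromisoformat(s) for s in samples]

# Difference between each pair of consecutive timestamps.
steps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]

is_regular = all(step == timedelta(milliseconds=100) for step in steps)
print(f"Constant 100 ms step in the sample: {is_regular}")
```

Parsing the column up front would let a profiler treat it as a datetime instead of a high-cardinality string.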
Common Values
| Value | Count | Frequency (%) |
| 2021-09-26T18:49:46.600 | 92 | < 0.1% |
| 2021-09-26T18:49:45.700 | 92 | < 0.1% |
| 2021-09-26T18:49:46.700 | 92 | < 0.1% |
| 2021-09-26T18:49:46.500 | 92 | < 0.1% |
| 2021-09-26T18:49:46.400 | 92 | < 0.1% |
| 2021-09-26T18:49:46.300 | 92 | < 0.1% |
| 2021-09-26T18:49:46.200 | 92 | < 0.1% |
| 2021-09-26T18:49:46.100 | 92 | < 0.1% |
| 2021-09-26T18:49:45.900 | 92 | < 0.1% |
| 2021-09-26T18:49:45.800 | 92 | < 0.1% |
| Other values (45083) | 1120905 |
Length (histogram not reproduced)
Words
| Value | Count | Frequency (%) |
| 2021-09-26t18:49:46.600 | 92 | < 0.1% |
| 2021-09-26t18:49:45.600 | 92 | < 0.1% |
| 2021-09-26t18:49:45.700 | 92 | < 0.1% |
| 2021-09-26t18:49:44.800 | 92 | < 0.1% |
| 2021-09-26t18:49:44.900 | 92 | < 0.1% |
| 2021-09-26t18:49:45.000 | 92 | < 0.1% |
| 2021-09-26t18:49:45.100 | 92 | < 0.1% |
| 2021-09-26t18:49:45.200 | 92 | < 0.1% |
| 2021-09-26t18:49:45.300 | 92 | < 0.1% |
| 2021-09-26t18:49:45.400 | 92 | < 0.1% |
| Other values (45083) | 1120905 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5548083 | |
| 2 | 4550134 | |
| 1 | 2534807 | |
| - | 2243650 | |
| : | 2243650 | |
| 9 | 1679966 | 6.5% |
| 6 | 1267162 | 4.9% |
| T | 1121825 | 4.3% |
| . | 1121825 | 4.3% |
| 4 | 792097 | 3.1% |
| Other values (4) | 2698776 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 19071025 | |
| Other Punctuation | 3365475 | 13.0% |
| Dash Punctuation | 2243650 | 8.7% |
| Uppercase Letter | 1121825 | 4.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5548083 | |
| 2 | 4550134 | |
| 1 | 2534807 | |
| 9 | 1679966 | 8.8% |
| 6 | 1267162 | 6.6% |
| 4 | 792097 | 4.2% |
| 3 | 783704 | 4.1% |
| 5 | 727030 | 3.8% |
| 7 | 595447 | 3.1% |
| 8 | 592595 | 3.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 2243650 | |
| . | 1121825 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2243650 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 1121825 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 24680150 | |
| Latin | 1121825 | 4.3% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 5548083 | |
| 2 | 4550134 | |
| 1 | 2534807 | |
| - | 2243650 | |
| : | 2243650 | |
| 9 | 1679966 | 6.8% |
| 6 | 1267162 | 5.1% |
| . | 1121825 | 4.5% |
| 4 | 792097 | 3.2% |
| 3 | 783704 | 3.2% |
| Other values (3) | 1915072 | 7.8% |
Latin
| Value | Count | Frequency (%) |
| T | 1121825 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 25801975 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 5548083 | |
| 2 | 4550134 | |
| 1 | 2534807 | |
| - | 2243650 | |
| : | 2243650 | |
| 9 | 1679966 | 6.5% |
| 6 | 1267162 | 4.9% |
| T | 1121825 | 4.3% |
| . | 1121825 | 4.3% |
| 4 | 792097 | 3.1% |
| Other values (4) | 2698776 |
jerseyNumber
Real number (ℝ≥0)
| Distinct | 99 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 48775 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 49.36259913 |
| Minimum | 1 |
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 22 |
| median | 52 |
| Q3 | 75 |
| 95-th percentile | 96 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 53 |
Descriptive statistics
| Standard deviation | 30.03068304 |
|---|---|
| Coefficient of variation (CV) | 0.6083691614 |
| Kurtosis | -1.351223707 |
| Mean | 49.36259913 |
| Median Absolute Deviation (MAD) | 27 |
| Skewness | 0.05531355981 |
| Sum | 52968537 |
| Variance | 901.8419237 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 23 | 23753 | 2.1% |
| 11 | 22666 | 2.0% |
| 76 | 21637 | 1.9% |
| 24 | 21452 | 1.9% |
| 26 | 21168 | 1.9% |
| 21 | 20328 | 1.8% |
| 97 | 18448 | 1.6% |
| 22 | 18255 | 1.6% |
| 25 | 18168 | 1.6% |
| 74 | 17811 | 1.6% |
| Other values (89) | 869364 | |
| (Missing) | 48775 | 4.3% |
| Value | Count | Frequency (%) |
| 1 | 12799 | |
| 2 | 17746 | |
| 3 | 8424 | |
| 4 | 10856 | |
| 5 | 6116 | 0.5% |
| 6 | 9145 | |
| 7 | 9385 | |
| 8 | 11150 | |
| 9 | 7668 | |
| 10 | 15901 |
| Value | Count | Frequency (%) |
| 99 | 15805 | |
| 98 | 13902 | |
| 97 | 18448 | |
| 96 | 11776 | |
| 95 | 9901 | |
| 94 | 16225 | |
| 93 | 9563 | |
| 92 | 6501 | 0.6% |
| 91 | 16844 | |
| 90 | 14304 |
team
Categorical
| Distinct | 33 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.6 MiB |
Length
| Max length | 8 |
|---|---|
| Median length | 3 |
| Mean length | 2.958409734 |
| Min length | 2 |
Characters and Unicode
| Total characters | 3318818 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | HOU |
|---|---|
| 2nd row | HOU |
| 3rd row | HOU |
| 4th row | HOU |
| 5th row | HOU |
Common Values
| Value | Count | Frequency (%) |
| football | 48775 | 4.3% |
| MIA | 44638 | 4.0% |
| LV | 44638 | 4.0% |
| LAC | 39864 | 3.6% |
| KC | 39864 | 3.6% |
| BUF | 37884 | 3.4% |
| WAS | 37884 | 3.4% |
| NE | 36850 | 3.3% |
| NO | 36850 | 3.3% |
| NYG | 35849 | 3.2% |
| Other values (23) | 718729 |
Length (histogram not reproduced)
Words
| Value | Count | Frequency (%) |
| football | 48775 | 4.3% |
| mia | 44638 | 4.0% |
| lv | 44638 | 4.0% |
| lac | 39864 | 3.6% |
| kc | 39864 | 3.6% |
| buf | 37884 | 3.4% |
| was | 37884 | 3.4% |
| ne | 36850 | 3.3% |
| no | 36850 | 3.3% |
| nyg | 35849 | 3.2% |
| Other values (23) | 718729 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 378741 | 11.4% |
| N | 282095 | 8.5% |
| I | 255992 | 7.7% |
| L | 254639 | 7.7% |
| C | 204754 | 6.2% |
| E | 188188 | 5.7% |
| T | 165374 | 5.0% |
| B | 139634 | 4.2% |
| D | 123860 | 3.7% |
| o | 97550 | 2.9% |
| Other values (20) | 1227991 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2928618 | |
| Lowercase Letter | 390200 | 11.8% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 378741 | |
| N | 282095 | 9.6% |
| I | 255992 | 8.7% |
| L | 254639 | 8.7% |
| C | 204754 | 7.0% |
| E | 188188 | 6.4% |
| T | 165374 | 5.6% |
| B | 139634 | 4.8% |
| D | 123860 | 4.2% |
| S | 97284 | 3.3% |
| Other values (14) | 838057 |
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 97550 | |
| l | 97550 | |
| f | 48775 | |
| a | 48775 | |
| b | 48775 | |
| t | 48775 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3318818 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 378741 | 11.4% |
| N | 282095 | 8.5% |
| I | 255992 | 7.7% |
| L | 254639 | 7.7% |
| C | 204754 | 6.2% |
| E | 188188 | 5.7% |
| T | 165374 | 5.0% |
| B | 139634 | 4.2% |
| D | 123860 | 3.7% |
| o | 97550 | 2.9% |
| Other values (20) | 1227991 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3318818 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 378741 | 11.4% |
| N | 282095 | 8.5% |
| I | 255992 | 7.7% |
| L | 254639 | 7.7% |
| C | 204754 | 6.2% |
| E | 188188 | 5.7% |
| T | 165374 | 5.0% |
| B | 139634 | 4.2% |
| D | 123860 | 3.7% |
| o | 97550 | 2.9% |
| Other values (20) | 1227991 |
playDirection
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.6 MiB |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 4.455663762 |
| Min length | 4 |
Characters and Unicode
| Total characters | 4998475 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | right |
|---|---|
| 2nd row | right |
| 3rd row | right |
| 4th row | right |
| 5th row | right |
Common Values
| Value | Count | Frequency (%) |
| left | 610650 | 54.4% |
| right | 511175 | 45.6% |
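The length and character totals reported for this variable can be recovered from the two value counts alone. The sketch below shows the arithmetic. The variable names are illustrative.

```python
# Value counts copied from the Common Values table above.
counts = {"left": 610_650, "right": 511_175}

n = sum(counts.values())

# Each occurrence contributes len(value) characters.
total_characters = sum(len(value) * count for value, count in counts.items())
mean_length = total_characters / n

# Share of each value, as a percentage to one decimal.
share = {value: round(100 * count / n, 1) for value, count in counts.items()}

print(f"Observations: {n}")
print(f"Total characters: {total_characters}")
print(f"Mean length: {mean_length:.6f}")
print(f"Share: {share}")
```

Because "left" has 4 characters and "right" has 5, the mean length is 4 plus the fraction of rows that are "right".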
Length (histogram not reproduced)
Category Frequency Plot (plot not reproduced)
Words
| Value | Count | Frequency (%) |
| left | 610650 | 54.4% |
| right | 511175 | 45.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 1121825 | |
| l | 610650 | |
| e | 610650 | |
| f | 610650 | |
| r | 511175 | |
| i | 511175 | |
| g | 511175 | |
| h | 511175 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4998475 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 1121825 | |
| l | 610650 | |
| e | 610650 | |
| f | 610650 | |
| r | 511175 | |
| i | 511175 | |
| g | 511175 | |
| h | 511175 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4998475 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 1121825 | |
| l | 610650 | |
| e | 610650 | |
| f | 610650 | |
| r | 511175 | |
| i | 511175 | |
| g | 511175 | |
| h | 511175 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4998475 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 1121825 | |
| l | 610650 | |
| e | 610650 | |
| f | 610650 | |
| r | 511175 | |
| i | 511175 | |
| g | 511175 | |
| h | 511175 |
x
Real number (ℝ)
| Distinct | 11794 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 59.24778826 |
| Minimum | -3.55 |
| Maximum | 119.73 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 40 |
| Negative (%) | < 0.1% |
| Memory size | 8.6 MiB |
Quantile statistics
| Minimum | -3.55 |
|---|---|
| 5-th percentile | 19.27 |
| Q1 | 39.59 |
| median | 58.8 |
| Q3 | 78.45 |
| 95-th percentile | 100.49 |
| Maximum | 119.73 |
| Range | 123.28 |
| Interquartile range (IQR) | 38.86 |
Descriptive statistics
| Standard deviation | 24.83586816 |
|---|---|
| Coefficient of variation (CV) | 0.4191864184 |
| Kurtosis | -0.8132372129 |
| Mean | 59.24778826 |
| Median Absolute Deviation (MAD) | 19.42 |
| Skewness | 0.05339164463 |
| Sum | 66465650.07 |
| Variance | 616.8203474 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 53.82 | 238 | < 0.1% |
| 69.61 | 232 | < 0.1% |
| 72.12 | 216 | < 0.1% |
| 33.7 | 213 | < 0.1% |
| 55.19 | 212 | < 0.1% |
| 55.16 | 205 | < 0.1% |
| 54.83 | 201 | < 0.1% |
| 51.02 | 201 | < 0.1% |
| 69.51 | 200 | < 0.1% |
| 69.6 | 198 | < 0.1% |
| Other values (11784) | 1119709 |
| Value | Count | Frequency (%) |
| -3.55 | 1 | |
| -3.54 | 1 | |
| -3.53 | 1 | |
| -3.5 | 1 | |
| -3.48 | 1 | |
| -3.46 | 1 | |
| -3.4 | 2 | |
| -3.33 | 1 | |
| -3.28 | 1 | |
| -3.25 | 1 |
| Value | Count | Frequency (%) |
| 119.73 | 2 | |
| 119.72 | 4 | |
| 119.7 | 2 | |
| 119.69 | 2 | |
| 119.66 | 3 | |
| 119.63 | 1 | < 0.1% |
| 119.62 | 2 | |
| 119.61 | 3 | |
| 119.6 | 4 | |
| 119.59 | 1 | < 0.1% |
y
Real number (ℝ)
| Distinct | 5400 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.62128333 |
| Minimum | -1.8 |
| Maximum | 56.32 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 61 |
| Negative (%) | < 0.1% |
| Memory size | 8.6 MiB |
Quantile statistics
| Minimum | -1.8 |
|---|---|
| 5-th percentile | 11.24 |
| Q1 | 21.83 |
| median | 26.59 |
| Q3 | 31.4 |
| 95-th percentile | 42.05 |
| Maximum | 56.32 |
| Range | 58.12 |
| Interquartile range (IQR) | 9.57 |
Descriptive statistics
| Standard deviation | 8.363867646 |
|---|---|
| Coefficient of variation (CV) | 0.3141797314 |
| Kurtosis | 0.3265031168 |
| Mean | 26.62128333 |
| Median Absolute Deviation (MAD) | 4.79 |
| Skewness | 0.0191207113 |
| Sum | 29864421.17 |
| Variance | 69.954282 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 23.86 | 1186 | 0.1% |
| 23.85 | 1133 | 0.1% |
| 23.82 | 1116 | 0.1% |
| 23.77 | 1111 | 0.1% |
| 23.76 | 1104 | 0.1% |
| 23.8 | 1103 | 0.1% |
| 23.87 | 1083 | 0.1% |
| 23.84 | 1073 | 0.1% |
| 23.81 | 1070 | 0.1% |
| 23.79 | 1063 | 0.1% |
| Other values (5390) | 1110783 |
| Value | Count | Frequency (%) |
| -1.8 | 1 | |
| -1.75 | 1 | |
| -1.65 | 1 | |
| -1.62 | 1 | |
| -1.51 | 1 | |
| -1.48 | 1 | |
| -1.47 | 1 | |
| -1.31 | 2 | |
| -1.28 | 1 | |
| -1.12 | 2 |
| Value | Count | Frequency (%) |
| 56.32 | 1 | |
| 55.72 | 1 | |
| 55.12 | 1 | |
| 55.09 | 1 | |
| 55.02 | 1 | |
| 54.89 | 1 | |
| 54.8 | 1 | |
| 54.75 | 1 | |
| 54.59 | 1 | |
| 54.46 | 1 |
s
Real number (ℝ≥0)
| Distinct | 2180 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.579950625 |
| Minimum | 0 |
| Maximum | 28.17 |
| Zeros | 69592 |
| Zeros (%) | 6.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.75 |
| median | 2.12 |
| Q3 | 3.81 |
| 95-th percentile | 6.8 |
| Maximum | 28.17 |
| Range | 28.17 |
| Interquartile range (IQR) | 3.06 |
Descriptive statistics
| Standard deviation | 2.396155797 |
|---|---|
| Coefficient of variation (CV) | 0.928760331 |
| Kurtosis | 14.4549856 |
| Mean | 2.579950625 |
| Median Absolute Deviation (MAD) | 1.49 |
| Skewness | 2.366121315 |
| Sum | 2894253.11 |
| Variance | 5.741562602 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 69592 | 6.2% |
| 0.01 | 16384 | 1.5% |
| 0.02 | 9518 | 0.8% |
| 0.03 | 7175 | 0.6% |
| 0.04 | 5997 | 0.5% |
| 0.05 | 5123 | 0.5% |
| 0.06 | 4794 | 0.4% |
| 0.07 | 4548 | 0.4% |
| 0.08 | 4232 | 0.4% |
| 0.09 | 3902 | 0.3% |
| Other values (2170) | 990560 |
| Value | Count | Frequency (%) |
| 0 | 69592 | |
| 0.01 | 16384 | 1.5% |
| 0.02 | 9518 | 0.8% |
| 0.03 | 7175 | 0.6% |
| 0.04 | 5997 | 0.5% |
| 0.05 | 5123 | 0.5% |
| 0.06 | 4794 | 0.4% |
| 0.07 | 4548 | 0.4% |
| 0.08 | 4232 | 0.4% |
| 0.09 | 3902 | 0.3% |
| Value | Count | Frequency (%) |
| 28.17 | 1 | |
| 28.07 | 1 | |
| 28.06 | 1 | |
| 27.92 | 1 | |
| 27.82 | 1 | |
| 27.75 | 1 | |
| 27.65 | 1 | |
| 27.48 | 1 | |
| 27.47 | 1 | |
| 27.34 | 1 |
a
Real number (ℝ≥0)
| Distinct | 1535 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.781462144 |
| Minimum | 0 |
| Maximum | 27.8 |
| Zeros | 65085 |
| Zeros (%) | 5.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.71 |
| median | 1.53 |
| Q3 | 2.57 |
| 95-th percentile | 4.45 |
| Maximum | 27.8 |
| Range | 27.8 |
| Interquartile range (IQR) | 1.86 |
Descriptive statistics
| Standard deviation | 1.429146826 |
|---|---|
| Coefficient of variation (CV) | 0.8022324977 |
| Kurtosis | 5.958978856 |
| Mean | 1.781462144 |
| Median Absolute Deviation (MAD) | 0.91 |
| Skewness | 1.405267039 |
| Sum | 1998488.77 |
| Variance | 2.042460649 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 65085 | 5.8% |
| 0.01 | 12683 | 1.1% |
| 0.02 | 7332 | 0.7% |
| 0.03 | 5547 | 0.5% |
| 0.04 | 4592 | 0.4% |
| 0.05 | 4143 | 0.4% |
| 1.06 | 3640 | 0.3% |
| 1.18 | 3622 | 0.3% |
| 0.96 | 3576 | 0.3% |
| 0.06 | 3562 | 0.3% |
| Other values (1525) | 1008043 |
| Value | Count | Frequency (%) |
| 0 | 65085 | |
| 0.01 | 12683 | 1.1% |
| 0.02 | 7332 | 0.7% |
| 0.03 | 5547 | 0.5% |
| 0.04 | 4592 | 0.4% |
| 0.05 | 4143 | 0.4% |
| 0.06 | 3562 | 0.3% |
| 0.07 | 3219 | 0.3% |
| 0.08 | 2885 | 0.3% |
| 0.09 | 2579 | 0.2% |
| Value | Count | Frequency (%) |
| 27.8 | 1 | |
| 27.77 | 1 | |
| 27.75 | 1 | |
| 27.36 | 1 | |
| 27.27 | 1 | |
| 25.75 | 1 | |
| 24.85 | 1 | |
| 24.32 | 1 | |
| 24.19 | 1 | |
| 24.09 | 1 |
dis
Real number (ℝ≥0)
| Distinct | 547 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2612968957 |
| Minimum | 0 |
| Maximum | 7.42 |
| Zeros | 70615 |
| Zeros (%) | 6.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.08 |
| median | 0.21 |
| Q3 | 0.38 |
| 95-th percentile | 0.68 |
| Maximum | 7.42 |
| Range | 7.42 |
| Interquartile range (IQR) | 0.3 |
Descriptive statistics
| Standard deviation | 0.2564520855 |
|---|---|
| Coefficient of variation (CV) | 0.9814586004 |
| Kurtosis | 49.41449584 |
| Mean | 0.2612968957 |
| Median Absolute Deviation (MAD) | 0.15 |
| Skewness | 4.196383183 |
| Sum | 293129.39 |
| Variance | 0.06576767217 |
| Monotonicity | Not monotonic |
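The alerts flag s and dis as highly correlated. One plausible reading, which the report itself does not confirm, is that s is a speed and dis is the distance covered since the previous frame. Under that assumption, and assuming a frame interval of 0.1 s as suggested by the sample timestamps, the mean of dis should be roughly the mean of s multiplied by the frame interval. The sketch below compares the two reported means.

```python
# Means copied from the report. Treating s as speed and dis as distance
# per frame is an assumption, not something the report states.
mean_s = 2.579950625
mean_dis = 0.2612968957
frame_interval_s = 0.1  # assumed from the 100 ms spacing of the sample timestamps

# If dis were exactly s * frame_interval_s, this ratio would equal 0.1.
ratio = mean_dis / mean_s
predicted_mean_dis = mean_s * frame_interval_s

print(f"mean dis / mean s = {ratio:.4f}")
print(f"mean s x frame interval = {predicted_mean_dis:.4f}")
```

The ratio lands close to the assumed interval, which fits the high-correlation alert but does not prove the interpretation.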
| Value | Count | Frequency (%) |
| 0 | 70615 | 6.3% |
| 0.01 | 59895 | 5.3% |
| 0.02 | 34957 | 3.1% |
| 0.03 | 26970 | 2.4% |
| 0.04 | 23688 | 2.1% |
| 0.05 | 21702 | 1.9% |
| 0.18 | 20885 | 1.9% |
| 0.2 | 20685 | 1.8% |
| 0.17 | 20682 | 1.8% |
| 0.19 | 20665 | 1.8% |
| Other values (537) | 801081 |
| Value | Count | Frequency (%) |
| 0 | 70615 | |
| 0.01 | 59895 | |
| 0.02 | 34957 | |
| 0.03 | 26970 | 2.4% |
| 0.04 | 23688 | 2.1% |
| 0.05 | 21702 | 1.9% |
| 0.06 | 20523 | 1.8% |
| 0.07 | 20322 | 1.8% |
| 0.08 | 19807 | 1.8% |
| 0.09 | 19766 | 1.8% |
| Value | Count | Frequency (%) |
| 7.42 | 1 | |
| 7.16 | 1 | |
| 6.99 | 1 | |
| 6.53 | 1 | |
| 6.48 | 1 | |
| 6.33 | 1 | |
| 6.32 | 1 | |
| 6.04 | 1 | |
| 5.98 | 1 | |
| 5.95 | 1 |
o
Real number (ℝ≥0)
| Distinct | 36001 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 48775 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 178.8167407 |
| Minimum | 0 |
| Maximum | 360 |
| Zeros | 3 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 29.57 |
| Q1 | 88.77 |
| median | 177.93 |
| Q3 | 268.4 |
| 95-th percentile | 329.72 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 179.63 |
Descriptive statistics
| Standard deviation | 99.32762729 |
|---|---|
| Coefficient of variation (CV) | 0.5554716348 |
| Kurtosis | -1.355980466 |
| Mean | 178.8167407 |
| Median Absolute Deviation (MAD) | 89.82 |
| Skewness | 0.01258116145 |
| Sum | 191879303.6 |
| Variance | 9865.977542 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 90 | 1257 | 0.1% |
| 87.34 | 111 | < 0.1% |
| 89 | 111 | < 0.1% |
| 266.42 | 101 | < 0.1% |
| 95.23 | 100 | < 0.1% |
| 85.48 | 98 | < 0.1% |
| 262.74 | 98 | < 0.1% |
| 258.31 | 98 | < 0.1% |
| 259.6 | 98 | < 0.1% |
| 90.81 | 98 | < 0.1% |
| Other values (35991) | 1070880 | |
| (Missing) | 48775 | 4.3% |
| Value | Count | Frequency (%) |
| 0 | 3 | < 0.1% |
| 0.01 | 15 | |
| 0.02 | 21 | |
| 0.03 | 12 | |
| 0.04 | 13 | |
| 0.05 | 16 | |
| 0.06 | 15 | |
| 0.07 | 22 | |
| 0.08 | 24 | |
| 0.09 | 15 |
| Value | Count | Frequency (%) |
| 360 | 11 | < 0.1% |
| 359.99 | 11 | < 0.1% |
| 359.98 | 10 | < 0.1% |
| 359.97 | 16 | |
| 359.96 | 17 | |
| 359.95 | 11 | < 0.1% |
| 359.94 | 30 | |
| 359.93 | 17 | |
| 359.92 | 22 | |
| 359.91 | 21 |
dir
Real number (ℝ≥0)
| Distinct | 36001 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 48775 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 180.5803742 |
| Minimum | 0 |
| Maximum | 360 |
| Zeros | 13 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 24.53 |
| Q1 | 90.99 |
| median | 180.51 |
| Q3 | 270.5 |
| 95-th percentile | 336.23 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 179.51 |
Descriptive statistics
| Standard deviation | 100.7337335 |
|---|---|
| Coefficient of variation (CV) | 0.5578332306 |
| Kurtosis | -1.285046983 |
| Mean | 180.5803742 |
| Median Absolute Deviation (MAD) | 89.75 |
| Skewness | -0.001276817356 |
| Sum | 193771770.6 |
| Variance | 10147.28507 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 92.43 | 73 | < 0.1% |
| 274.3 | 73 | < 0.1% |
| 271.78 | 72 | < 0.1% |
| 269.43 | 72 | < 0.1% |
| 270.61 | 72 | < 0.1% |
| 91.51 | 71 | < 0.1% |
| 87.02 | 71 | < 0.1% |
| 88.1 | 71 | < 0.1% |
| 91.33 | 71 | < 0.1% |
| 270.49 | 70 | < 0.1% |
| Other values (35991) | 1072334 | |
| (Missing) | 48775 | 4.3% |
| Value | Count | Frequency (%) |
| 0 | 13 | |
| 0.01 | 17 | |
| 0.02 | 27 | |
| 0.03 | 23 | |
| 0.04 | 26 | |
| 0.05 | 21 | |
| 0.06 | 25 | |
| 0.07 | 28 | |
| 0.08 | 16 | |
| 0.09 | 28 |
| Value | Count | Frequency (%) |
| 360 | 10 | < 0.1% |
| 359.99 | 31 | |
| 359.98 | 32 | |
| 359.97 | 28 | |
| 359.96 | 28 | |
| 359.95 | 24 | |
| 359.94 | 14 | |
| 359.93 | 23 | |
| 359.92 | 22 | |
| 359.91 | 21 |
event
Categorical
| Distinct | 20 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.6 MiB |
| None | |
|---|---|
| ball_snap | 26197 |
| pass_forward | 22701 |
| autoevent_ballsnap | 11638 |
| autoevent_passforward | 11454 |
| Other values (15) | 12719 |
Length
| Max length | 25 |
|---|---|
| Median length | 4 |
| Mean length | 4.669379805 |
| Min length | 3 |
Characters and Unicode
| Total characters | 5238227 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | None |
|---|---|
| 2nd row | None |
| 3rd row | None |
| 4th row | None |
| 5th row | None |
Common Values
| Value | Count | Frequency (%) |
| None | 1037116 | 92.4% |
| ball_snap | 26197 | 2.3% |
| pass_forward | 22701 | 2.0% |
| autoevent_ballsnap | 11638 | 1.0% |
| autoevent_passforward | 11454 | 1.0% |
| play_action | 5405 | 0.5% |
| run | 1725 | 0.2% |
| qb_sack | 1518 | 0.1% |
| pass_arrived | 1127 | 0.1% |
| autoevent_passinterrupted | 644 | 0.1% |
| Other values (10) | 2300 | 0.2% |
Length (histogram not reproduced)
Words
| Value | Count | Frequency (%) |
| none | 1037116 | 92.4% |
| ball_snap | 26197 | 2.3% |
| pass_forward | 22701 | 2.0% |
| autoevent_ballsnap | 11638 | 1.0% |
| autoevent_passforward | 11454 | 1.0% |
| play_action | 5405 | 0.5% |
| run | 1725 | 0.2% |
| qb_sack | 1518 | 0.1% |
| pass_arrived | 1127 | 0.1% |
| autoevent_passinterrupted | 644 | 0.1% |
| Other values (10) | 2300 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 1108968 | |
| o | 1102597 | |
| e | 1088958 | |
| N | 1037116 | |
| a | 184414 | 3.5% |
| s | 113482 | 2.2% |
| _ | 83559 | 1.6% |
| l | 81765 | 1.6% |
| p | 80937 | 1.5% |
| r | 73991 | 1.4% |
| Other values (15) | 282440 | 5.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4117552 | |
| Uppercase Letter | 1037116 | 19.8% |
| Connector Punctuation | 83559 | 1.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 1108968 | |
| o | 1102597 | |
| e | 1088958 | |
| a | 184414 | 4.5% |
| s | 113482 | 2.8% |
| l | 81765 | 2.0% |
| p | 80937 | 2.0% |
| r | 73991 | 1.8% |
| t | 56971 | 1.4% |
| b | 39629 | 1.0% |
| Other values (13) | 185840 | 4.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 1037116 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 83559 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5154668 | |
| Common | 83559 | 1.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 1108968 | 21.5% |
| o | 1102597 | 21.4% |
| e | 1088958 | 21.1% |
| N | 1037116 | 20.1% |
| a | 184414 | 3.6% |
| s | 113482 | 2.2% |
| l | 81765 | 1.6% |
| p | 80937 | 1.6% |
| r | 73991 | 1.4% |
| t | 56971 | 1.1% |
| Other values (14) | 225469 | 4.4% |
Common
| Value | Count | Frequency (%) |
| _ | 83559 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5238227 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 1108968 | 21.2% |
| o | 1102597 | 21.0% |
| e | 1088958 | 20.8% |
| N | 1037116 | 19.8% |
| a | 184414 | 3.5% |
| s | 113482 | 2.2% |
| _ | 83559 | 1.6% |
| l | 81765 | 1.6% |
| p | 80937 | 1.5% |
| r | 73991 | 1.4% |
| Other values (15) | 282440 | 5.4% |
Auto
The auto setting is an easily interpretable pairwise column metric that selects a method by variable type: categorical-categorical pairs use Cramér's V, numerical-categorical pairs use Cramér's V (on a discretized numerical column), and numerical-numerical pairs use Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
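As an illustration of that calculation, the sketch below ranks two series and divides the covariance of the ranks by the product of their standard deviations. The function name `spearman_rho` and the toy data are hypothetical and are not taken from the profiled dataset or from the profiling library.

```python
import pandas as pd


def spearman_rho(x: pd.Series, y: pd.Series) -> float:
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = x.rank(), y.rank()  # ties, if any, receive average ranks
    return rx.cov(ry) / (rx.std() * ry.std())


# Hypothetical toy data: y is a nonlinear but strictly increasing function of x.
x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
y = x ** 3

rho = spearman_rho(x, y)   # monotonic relationship, so rho is 1 up to rounding
pearson_r = x.corr(y)      # Pearson's r (pandas default) is below 1 for this curve
```

Because the relationship is monotonic but not linear, ρ reaches its maximum while Pearson's r does not.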
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
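The formula and the invariance property can be checked with a short sketch. The helper `pearson_r` and the sample values are hypothetical, and the shift and scale constants are arbitrary (with positive scale factors, so the sign of r is preserved).

```python
import numpy as np


def pearson_r(x, y) -> float:
    """Covariance of x and y divided by the product of their standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))  # population covariance
    return cov / (x.std() * y.std())                # population std (ddof=0) to match


# Hypothetical sample: a roughly linear relationship with small deviations.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.9, 5.2, 7.1, 8.8])

r = pearson_r(x, y)
# Shift and rescale each variable separately; r should be unchanged.
r_transformed = pearson_r(10.0 * x + 3.0, 0.5 * y - 7.0)
```

Using the population covariance with the population standard deviation (or the sample versions of both) gives the same ratio, since the normalizing factors cancel.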
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
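The pair-counting definition can be sketched directly. This is the simple form described above (often called tau-a), written as a hypothetical helper `kendall_tau_a`; library implementations such as `scipy.stats.kendalltau` default to a tie-corrected variant (tau-b), so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y) -> float:
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = n_pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        n_pairs += 1
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # sign == 0 is a tie: counted in n_pairs, in neither other bucket
    return (concordant - discordant) / n_pairs


# Hypothetical toy data: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])        # (5 - 1) / 6
tau_reversed = kendall_tau_a([1, 2, 3], [3, 2, 1])     # fully discordant
```

The quadratic loop over all pairs is fine for illustration but would be slow on a dataset with over a million rows.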
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.
First rows
| | gameId | playId | nflId | frameId | time | jerseyNumber | team | playDirection | x | y | s | a | dis | o | dir | event |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021092300 | 54 | 41300.0 | 1 | 2021-09-24T00:23:08.400 | 58.0 | HOU | right | 38.66 | 28.98 | 0.00 | 0.00 | 0.00 | 259.88 | 205.34 | None |
| 1 | 2021092300 | 54 | 41300.0 | 2 | 2021-09-24T00:23:08.500 | 58.0 | HOU | right | 38.66 | 28.98 | 0.00 | 0.00 | 0.01 | 259.88 | 197.10 | None |
| 2 | 2021092300 | 54 | 41300.0 | 3 | 2021-09-24T00:23:08.600 | 58.0 | HOU | right | 38.66 | 28.97 | 0.00 | 0.00 | 0.00 | 259.88 | 192.98 | None |
| 3 | 2021092300 | 54 | 41300.0 | 4 | 2021-09-24T00:23:08.700 | 58.0 | HOU | right | 38.66 | 28.97 | 0.02 | 0.34 | 0.00 | 259.88 | 181.68 | None |
| 4 | 2021092300 | 54 | 41300.0 | 5 | 2021-09-24T00:23:08.800 | 58.0 | HOU | right | 38.66 | 28.97 | 0.05 | 0.37 | 0.00 | 260.78 | 199.16 | None |
| 5 | 2021092300 | 54 | 41300.0 | 6 | 2021-09-24T00:23:08.900 | 58.0 | HOU | right | 38.65 | 28.97 | 0.10 | 0.62 | 0.01 | 262.87 | 271.88 | autoevent_ballsnap |
| 6 | 2021092300 | 54 | 41300.0 | 7 | 2021-09-24T00:23:09.000 | 58.0 | HOU | right | 38.63 | 28.98 | 0.25 | 1.14 | 0.02 | 264.34 | 287.33 | None |
| 7 | 2021092300 | 54 | 41300.0 | 8 | 2021-09-24T00:23:09.100 | 58.0 | HOU | right | 38.59 | 29.00 | 0.48 | 1.73 | 0.05 | 265.65 | 293.21 | ball_snap |
| 8 | 2021092300 | 54 | 41300.0 | 9 | 2021-09-24T00:23:09.200 | 58.0 | HOU | right | 38.52 | 29.02 | 0.75 | 2.47 | 0.07 | 266.51 | 293.93 | None |
| 9 | 2021092300 | 54 | 41300.0 | 10 | 2021-09-24T00:23:09.300 | 58.0 | HOU | right | 38.42 | 29.06 | 1.17 | 2.82 | 0.11 | 268.87 | 290.62 | None |
Last rows
| | gameId | playId | nflId | frameId | time | jerseyNumber | team | playDirection | x | y | s | a | dis | o | dir | event |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1121815 | 2021092700 | 4156 | NaN | 38 | 2021-09-28T03:24:35.400 | NaN | football | left | 76.23 | 26.88 | 4.72 | 3.01 | 0.43 | NaN | NaN | None |
| 1121816 | 2021092700 | 4156 | NaN | 39 | 2021-09-28T03:24:35.500 | NaN | football | left | 75.88 | 27.36 | 5.13 | 0.92 | 0.59 | NaN | NaN | None |
| 1121817 | 2021092700 | 4156 | NaN | 40 | 2021-09-28T03:24:35.600 | NaN | football | left | 75.54 | 27.90 | 5.62 | 1.41 | 0.63 | NaN | NaN | None |
| 1121818 | 2021092700 | 4156 | NaN | 41 | 2021-09-28T03:24:35.700 | NaN | football | left | 75.20 | 28.42 | 5.54 | 1.04 | 0.63 | NaN | NaN | None |
| 1121819 | 2021092700 | 4156 | NaN | 42 | 2021-09-28T03:24:35.800 | NaN | football | left | 74.90 | 28.86 | 5.11 | 2.07 | 0.53 | NaN | NaN | run |
| 1121820 | 2021092700 | 4156 | NaN | 43 | 2021-09-28T03:24:35.900 | NaN | football | left | 74.61 | 29.30 | 5.11 | 4.88 | 0.52 | NaN | NaN | None |
| 1121821 | 2021092700 | 4156 | NaN | 44 | 2021-09-28T03:24:36.000 | NaN | football | left | 74.35 | 29.71 | 4.66 | 5.33 | 0.49 | NaN | NaN | None |
| 1121822 | 2021092700 | 4156 | NaN | 45 | 2021-09-28T03:24:36.100 | NaN | football | left | 74.12 | 30.09 | 4.16 | 4.69 | 0.44 | NaN | NaN | None |
| 1121823 | 2021092700 | 4156 | NaN | 46 | 2021-09-28T03:24:36.200 | NaN | football | left | 73.15 | 30.52 | 5.15 | 3.22 | 1.06 | NaN | NaN | None |
| 1121824 | 2021092700 | 4156 | NaN | 47 | 2021-09-28T03:24:36.300 | NaN | football | left | 72.35 | 30.84 | 5.23 | 3.34 | 0.86 | NaN | NaN | None |